This notebook contains analysis of mRNA, microRNA and DNAm data. mRNA data is previously unpublished, although MiR and DNAm data is: MiR: https://pubmed.ncbi.nlm.nih.gov/30783190/ DNAm: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7554960/ This is the first notebook that generates results required for further analysis so it should be ran first. The information about libraries and versions used while writing this notebook are provided in the bottom via sessionInfo() function.
The aim in this project is to make an multi-omics integrative analysis of HNSC tumor samples. High-throughput sequencing of mRNA and microRNA was preformed. Exploratory data analysis and differential expression analysis was preformed according to the DESeq2 vignette. DNAm data was generated Illumina Infinium Methylation EPIC BeadChip that contains ~900k CpG’s.
library(rmarkdown)
library(BiocManager)
library(GenomicFeatures)
library(stringr)
library(tximport)
library(DESeq2)
library(pheatmap)
library(RColorBrewer)
library(ggplot2)
library(ggrepel)
library(gprofiler2)
library(RnBeads)
library(RnBeads.hg19)
library(readxl)
library(plyranges)
source(file = file.path(getwd(), "scripts", "idMap.R"))
source(file = file.path(getwd(), "scripts", "triangleHeatMap.R"))
In this chapter Data analysis of mRNA data is preformed.
Raw data was provided by Sabol and dowloaded from cloud:
https://irbhr-my.sharepoint.com/:f:/g/personal/ivan_sabol_irb_hr/Eq5MRE0xi8pNuNOuwWQ-vb0BT6MaSxwmp97PKUF97YRoxg?e=gA69Fb
Data demultiplexing was preformed with script:
Building index with: For RNA seq mapping index is built from the gencode transcirpt data from hg19 assembly: https://www.gencodegenes.org/human/release_19.html The bash call for salmon to build the index is this:
salmon index -t gencode.v19.pc_transcripts.fa.gz -i homosapiens_hg19_gencode_transcripts
Mapping with Salmon. Mapping was done on a cluster with salmon tool. the Folder has to contain folder with data (mRNA_data) and the index file that was built. Data is saved in individual folders that are named OV401, OV402, … OV413.
for fn in mRNA_data/OV{401..413};
do
samp=`basename ${fn}`
echo "Processing sample ${samp}"
salmon quant -i homosapiens_hg19_gencode_transcripts -l A -p 6 --validateMappings --gcBias --numGibbsSamples 20 -o quants/${samp}_quant -1 ${fn}/${samp}_1.fastq.gz -2 ${fn}/${samp}_2.fastq.gz
done
#reading coldata
dir <- file.path(getwd(), "inputs")
mRNA_coldata <-
read.csv(file.path(dir, "mRNA_sample_table.csv"),
row.names=1, sep=",",
stringsAsFactors=FALSE)
#adding file path of quants for tximport
mRNA_coldata$files <-
file.path(dir,
"mRNA_quants_hg19_new",
paste0(mRNA_coldata$names,"_quant"),
"quant.sf")
#Check if everything is okay
file.exists(mRNA_coldata$files)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
Tximport
This function maps transcript id’s to gene id’s. Firstly we make an txdb object from the gff3 file. Then build an tx2gene object which contains ensembl gene ids and txids from salmon. After tx2gene object is built we input it in the tximport function. Next we can build an DESeq object wiht DESeqDataSetFromTximport function. gff3 file - https://www.gencodegenes.org/human/release_19.html
#fetching database for mapping gene id to transcript name
txdb <- makeTxDbFromGFF(file.path(dir, "gencode.v19.annotation.gff3.gz"))
k <- keys(txdb, keytype = "TXNAME")
tx2gene <- AnnotationDbi::select(txdb,k,"GENEID", "TXNAME")
saveRDS(tx2gene, file = file.path(getwd(), "outputs/RDS/tx2gene.rds"))
Separating these two chunks to make the code faster
tx2gene <- readRDS(file.path(getwd(), "outputs/RDS/tx2gene.rds"))
#parsing txnames
tx2gene$TXNAME <- str_split(tx2gene$TXNAME,pattern = "\\.",
simplify =T)[,1]
# importing quantification data
txi <- tximport(mRNA_coldata$files,
type="salmon",
tx2gene=tx2gene,
ignoreTxVersion=T,
ignoreAfterBar=T)
Building dds object
dds (DeseqDataSet) objects are custom DESeq class of SummarizedExperiments class from Bioconductor. Also making a minimum filter to filter out
# building mRNA dds object
mRNA_dds <-
DESeqDataSetFromTximport(txi, mRNA_coldata, design= ~ Hist + Tissue + HPV)
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
# trimming rownames to remove the suffix
rownames(mRNA_dds) <- str_split(rownames(mRNA_dds),"\\.", simplify = T)[,1]
# adding colnames
colnames(mRNA_dds) <- mRNA_coldata$Sample.ID
# filtering rows low counts
keep <- rowSums(counts(mRNA_dds)) > 10
mRNA_dds <- mRNA_dds[keep,]
dim(mRNA_dds)
## [1] 18104 13
head(counts(mRNA_dds))
## KBD2 KBD3 KBD10 KBD11 KBD13 KBD15 KBD16 KBD17 KBD18 KBD21 RNA3T
## ENSG00000000003 150 642 325 160 310 694 397 398 857 404 167
## ENSG00000000005 0 1 0 0 3 0 1 0 0 1 7
## ENSG00000000419 118 611 200 486 392 589 478 392 258 595 275
## ENSG00000000457 106 241 192 169 218 278 199 301 289 315 311
## ENSG00000000460 131 305 133 132 107 490 248 344 177 372 131
## ENSG00000000938 120 226 283 228 762 154 243 164 331 374 541
## RNA10T RNA16T
## ENSG00000000003 206 131
## ENSG00000000005 0 23
## ENSG00000000419 337 102
## ENSG00000000457 491 107
## ENSG00000000460 213 60
## ENSG00000000938 1002 34
To do an inter sample exploratory analysis, a statistical method for stabilizing variance is preformed.
mRNA_rld <- DESeq2::rlog(mRNA_dds)
PCA plot
PCA plot converts the correlations (or lack there of) among all of the samples into a 2-D graph. Highly correlated samples cluster together. It is a version of a dimension reduction method. Also, the axis are ordered by importanc or how much of the variance in the data they represent.
# selecting data for plotting pca
mRNA_pcadata <-
DESeq2::plotPCA(mRNA_rld,
intgroup = c("Hist", "HPV", "Tissue"),
returnData = TRUE)
# selecting variance
percentVar <- round(100 * attr(mRNA_pcadata, "percentVar"))
# plotting pca data
ggplot(mRNA_pcadata, aes(x = PC1, y = PC2, color = HPV, shape = Hist)) +
geom_point(size =3) +
xlab(paste0("PC1: ", percentVar[1], "% variance")) +
ylab(paste0("PC2: ", percentVar[2], "% variance")) +
theme(panel.grid = element_blank(),
panel.background = element_rect(fill = "white"),
panel.border = element_rect(fill = "transparent"))+
coord_fixed() +
ggtitle("PCA with rlog data")+
scale_color_manual(values = c("firebrick1","forestgreen","blue1"))+
geom_text_repel(aes(label = colData(mRNA_rld)$Sample.ID), size =3)
Sample distance To asses overall similarity between samples we calculate sample distance using the R function dist to calculate the Euclidean distance.
Triangle sample distance heatmap
mRNA_annot_row <- as.data.frame(colData(mRNA_rld)[,c("Hist","Tissue", "HPV")])
# Calling the local funciton
triangleHeatMap(assay(mRNA_rld),
annot_row = mRNA_annot_row)
## Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
## Also defined by 'spam'
## Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
## Also defined by 'spam'
The call to DESeq function preforms the estimation of size factors, estimation of dispersion values and fits a generalized linear model.
mRNA_dds <- DESeq(mRNA_dds, betaPrior = T)
Differential expression on cancer control group
mRNA_res <- results(mRNA_dds, contrast = c("Hist","cancer","normal"))
mRNA_res
## log2 fold change (MAP): Hist cancer vs normal
## Wald test p-value: Hist cancer vs normal
## DataFrame with 18104 rows and 6 columns
## baseMean log2FoldChange lfcSE stat pvalue
## <numeric> <numeric> <numeric> <numeric> <numeric>
## ENSG00000000003 364.56985 0.585192 0.618222 0.946572 0.34385672
## ENSG00000000005 5.82988 -3.843846 1.130315 -3.400686 0.00067217
## ENSG00000000419 353.89118 0.523721 0.425025 1.232213 0.21786970
## ENSG00000000457 241.36876 -0.426076 0.479682 -0.888247 0.37440769
## ENSG00000000460 205.29988 0.327594 0.476669 0.687256 0.49192119
## ... ... ... ... ... ...
## ENSG00000273291 1.19174 0.0464801 1.194795 0.0389021 0.968968
## ENSG00000273294 6.89812 1.9531712 1.194128 1.6356458 0.101914
## ENSG00000273331 4.19184 1.5054733 1.192183 1.2627868 0.206666
## ENSG00000273398 21.52493 -0.9066724 1.032461 -0.8781663 0.379853
## ENSG00000273439 13.65698 -0.2950699 0.654476 -0.4508493 0.652098
## padj
## <numeric>
## ENSG00000000003 0.55418382
## ENSG00000000005 0.00930593
## ENSG00000000419 0.42149623
## ENSG00000000457 0.58019653
## ENSG00000000460 0.67826924
## ... ...
## ENSG00000273291 NA
## ENSG00000273294 0.267233
## ENSG00000273331 0.408695
## ENSG00000273398 0.584976
## ENSG00000273439 0.795028
Some summary statistic
summary(mRNA_res, alpha=0.05)
##
## out of 18104 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 1081, 6%
## LFC < 0 (down) : 1448, 8%
## outliers [1] : 1182, 6.5%
## low counts [2] : 350, 1.9%
## (mean count < 1)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
mRNA_DE <- filter(as.data.frame(mRNA_res), padj < 0.05)
summary(mRNA_DE)
## baseMean log2FoldChange lfcSE stat
## Min. : 2.14 Min. :-6.8618 Min. :0.2436 Min. :-10.0581
## 1st Qu.: 67.48 1st Qu.:-2.6141 1st Qu.:0.5311 1st Qu.: -3.5335
## Median : 256.60 Median :-1.5892 Median :0.6630 Median : -2.8231
## Mean : 1325.16 Mean :-0.5309 Mean :0.6841 Mean : -0.5478
## 3rd Qu.: 803.43 3rd Qu.: 1.8801 3rd Qu.:0.8192 3rd Qu.: 3.1543
## Max. :181417.54 Max. : 7.1293 Max. :1.1917 Max. : 8.3316
## pvalue padj
## Min. :0.000e+00 Min. :0.000000
## 1st Qu.:7.652e-05 1st Qu.:0.002003
## Median :8.098e-04 Median :0.010608
## Mean :1.799e-03 Mean :0.015340
## 3rd Qu.:3.017e-03 3rd Qu.:0.026357
## Max. :7.611e-03 Max. :0.049870
hist(mRNA_DE$log2FoldChange, border="white", col="darkgreen")
hist(log(mRNA_DE$baseMean), border="white", col="darkblue")
GSEA on DE mRNA In the next chunk gene set enrichment analysis is preformed on DE mRNA data
# Using only part of sources
GSEA_SOURCES <- c("GO:BP","CORUM","KEGG","REAC","WP","MIRNA", "TF")
# Call on the gost
gost_mRNA <-
gost(rownames(mRNA_DE),
organism= "hsapiens",
exclude_iea=TRUE,
domain_scope="annotated",
ordered_query=F,
sources=GSEA_SOURCES)
# Change query name
gost_mRNA$result$query <- "DE genes vs genome"
#Plot and table of significant results
gostplot(gost_mRNA ,capped = F, interactive = T)
DE genes on background. Results are similar, which leads to conclude that all sequenced data is covering most of the genes.
# Call on the gost
gost_mRNA <-
gost(rownames(mRNA_DE),
organism= "hsapiens",
exclude_iea=TRUE,
domain_scope="custom",
custom_bg=rownames(mRNA_res),
ordered_query=F,
sources=GSEA_SOURCES)
# Change query name
gost_mRNA$result$query <- "DE genes vs sequenced genes"
#Plot and table of significant results
gostplot(gost_mRNA ,capped = F, interactive = T)
HPV effect on cancer In the next chunk explores the effect of HPV on cancer. To be more precise, it explores which genes are DE in HPV in cancer. Firstly compare HPV - and controls to get a list of genes that re DE in cancer samples. Secondly, compare HPV + and HPV - samples and exclude genes from the previous comparison.
# making another dds object with different design
mRNA_dds_HPV <- mRNA_dds
# New column in colData for design
colData(mRNA_dds_HPV)$HistHPV <-
as.factor(paste0(colData(mRNA_dds_HPV)$Hist,colData(mRNA_dds_HPV)$HPV))
# setting the HistHPV des
design(mRNA_dds_HPV) =~ HistHPV
# Running the DESeq function
mRNA_dds_HPV <- DESeq(mRNA_dds_HPV, betaPrior = T)
# Contrasting between negative HPV and controls
res.cN.nN <-
results(mRNA_dds_HPV, contrast=c("HistHPV","cancerN","normalN"), tidy = T)
# Contrasting between HPV negative and positive
res.cP.cN <-
results(mRNA_dds_HPV, contrast=c("HistHPV","cancerP","cancerN"), tidy = T)
# selecting significant results
res.cP.cN <- dplyr::filter(res.cP.cN, padj < 0.1)
res.cN.nN <- dplyr::filter(res.cN.nN, padj < 0.1)
#
mRNA_HPV <- res.cP.cN[!is.element(res.cP.cN$row, res.cN.nN$row),]
mRNA_HPV$symbol <- idMap(mRNA_HPV$row)
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
head(mRNA_HPV)
## row baseMean log2FoldChange lfcSE stat pvalue
## 1 ENSG00000006047 98.943311 2.732856 0.7401323 3.692389 2.221573e-04
## 2 ENSG00000007038 297.657431 3.560497 0.8611900 4.134392 3.558962e-05
## 3 ENSG00000008196 5.563191 3.648879 0.9743384 3.744981 1.804076e-04
## 7 ENSG00000077935 52.302104 3.863428 0.8109130 4.764294 1.895156e-06
## 10 ENSG00000100285 305.883103 5.204755 0.7290451 7.139140 9.391694e-13
## 11 ENSG00000102387 22.342732 4.045055 0.8866892 4.561976 5.067450e-06
## padj symbol
## 1 7.274828e-02 YBX2
## 2 2.387378e-02 PRSS21
## 3 6.137155e-02 TFAP2B
## 7 3.259949e-03 SMC1B
## 10 1.660733e-08 NEFH
## 11 7.467310e-03 TAF7L
GSEA on HPV effect
Gene set analysis on HPV related DE mRNA.
gost_HPV <-
gost(mRNA_HPV$row,
organism= "hsapiens",
exclude_iea=TRUE,
ordered_query=F,
significant=F,
sources=GSEA_SOURCES)
# Change query name
gost_HPV$result$query <- "DE HPV related genes vs genome"
#Plot and table of significant results
gostplot(gost_HPV ,capped = F, interactive = T)
Volcano plot
volcano <- mRNA_res
LFC.TRESH <- 1
PVAL.TRESH <- 0.05
# Building volcano plot
volcano <- as.data.frame(mRNA_res) %>%
dplyr::filter(!is.na(padj)) %>% #filtering NA's
dplyr::mutate(diffexpressed = dplyr::case_when(
padj < PVAL.TRESH & log2FoldChange > LFC.TRESH ~ "UP",
padj < PVAL.TRESH & log2FoldChange < -LFC.TRESH ~ "DOWN",
TRUE ~ "NO"))
# Getting hgnc symbols for plotting
volcano$hgnc_symbol <- idMap(rownames(volcano))
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "output length is different than input, check df manualy"
# Volcano colors
volcano_cols <- c("red", "gray60", "green")
# p value tresh for delabeling
p_volcano_tresh <- arrange(volcano,padj)[12, "padj"]
volcano <- volcano %>%
mutate(delabel = ifelse(padj <= p_volcano_tresh,
hgnc_symbol,
NA))
ggplot(data=as.data.frame(volcano),
aes(x=log2FoldChange,
y=-log10(padj),
col=diffexpressed))+
geom_point()+
theme_minimal()+
scale_color_manual(values = volcano_cols)
ggplot(data=as.data.frame(volcano),
aes(x=log2FoldChange,
y=-log10(padj),
col=diffexpressed))+
geom_point()+
theme_minimal()+
scale_color_manual(values = volcano_cols)+
geom_text_repel(aes(label = delabel), show.legend = FALSE,
size =3, max.overlaps = 50)
## Warning: Removed 16560 rows containing missing values (geom_text_repel).
Counts plot
Visualizing counts for genes
# Using the list of genes that are labeled in volcano plot
genes <- rownames(filter(volcano, !is.na(delabel)))
# Building an object for multiple counts plot
op <- par(mfrow = c(3,4),
oma = c(2,2,0,0) + 1,
mar = c(0,0,1,1) + 1.2)
# Plotting
for(gene in genes){
plotCounts(mRNA_dds, gene = gene, intgroup = c("Hist"),
returnData = F, pch = 19, main=idMap(gene))
}
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
## [1] "IDTABLE reading from filename: /home/katarina/Projects/HNSCC/scripts/IDTABLE_hg19.rds"
par(op)
Raw sequencing data was previously analyzed and published. Since it was approved this notebook will start with a count matrix that was mapped with smallRNA Basepace application. It was also provided by Sabol and is located in the cloud linked in the mRNA Raw data section.
The matrix contains only microRNA counts. For input the counts table was modified. The modification included deleting the second row which contains the information about the sample group and the column names were renamed to an unique id (eg. from Grce_17trimmed to RNA3T). Also, a 0 was added for the first column to denote an column for rownames. The modified file is miR_counts_matrix.txt
# count matrix
mir_countdata <- read.csv("inputs/miR_counts_matrix.txt", row.names=1)
# coldata
mir_coldata <- read.csv("inputs/miR_sample_table.csv",
row.names = 1,
sep = ";",
stringsAsFactors = FALSE)
#coldata$HPV <- as.factor(coldata$HPV)
#setting an names variable that is used in other DESeq2 functions
mir_coldata$names <- mir_coldata$Sample.ID
Excluding samples
Here we exclude samples with sampleID KBD9 and KBD25 since they are outliers as per paper Božinović et al.
keep <- mir_coldata$names != "KBD9" & mir_coldata$names != "KBD25"
mir_coldata <- mir_coldata [keep,]
keep <- colnames(mir_countdata) != "KBD9" & colnames(mir_countdata) != "KBD25"
mir_countdata <- mir_countdata [,keep]
Creating a dds object The aim of this experiment is to explore differentially expressed microRNA genes. From this data we can explore difference between cancer and normal, HPV pos and neg samples and different tissue samples. This notebook demonstrates differential expression analysis on the design Tissue + HPV.
mir_dds <- DESeqDataSetFromMatrix(countData = mir_countdata,
colData = mir_coldata,
design = ~Hist + Tissue + HPV)
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
# filtering low count rows
dim(mir_dds)
## [1] 5789 20
keep <- rowSums(counts(mir_dds)) > 10
mir_dds <- mir_dds[keep,]
dim(mir_dds)
## [1] 3706 20
Variance transformation is a statistical method for multidimensional data so it becomes homoskedastic. This data can be analyzed through clustering and principal component analysis.
rlog transformation Regularized logartihm (rlog) transformation behaves similarly to the log2 transformation. It is slower than the VST but recomended for small datasets (n<30).
mir_rld <- rlog(mir_dds)
Sample distance To asses overall similarity between samples we calculate sample distance using the R function dist to calculate the Euclidean distance.
Triangle sample distance heatmap
mir_annot_row <- as.data.frame(colData(mir_rld)[,c("Hist","Tissue", "HPV")])
triangleHeatMap(assay(mir_rld),
color = "Reds",
annot_row = mir_annot_row,
triange.diagonal = F)
# selecting data for plotting pca
mRNA_pcadata <-
DESeq2::plotPCA(mir_rld,
intgroup = c("Hist", "HPV", "Tissue"),
returnData = TRUE)
# selecting variance
percentVar <- round(100 * attr(mRNA_pcadata, "percentVar"))
# plotting pca data
ggplot(mRNA_pcadata, aes(x = PC1, y = PC2, color = HPV, shape = Hist)) +
geom_point(size =3) +
xlab(paste0("PC1: ", percentVar[1], "% variance")) +
ylab(paste0("PC2: ", percentVar[2], "% variance")) +
theme(panel.grid = element_blank(),
panel.background = element_rect(fill = "white"),
panel.border = element_rect(fill = "transparent"))+
coord_fixed() +
ggtitle("PCA with rlog data")+
scale_color_manual(values = c("firebrick1","forestgreen","blue1"))+
geom_text_repel(aes(label = colData(mir_rld)$Sample.ID), size =3)
The call to DESeq function preforms the estimation of size factors, estimation of dispersion values and fits a generalized linear model.
mir_dds <- DESeq(mir_dds, betaPrior = T)
Differential expression on cancer control group
mir_res <- results(mir_dds, contrast = c("Hist","cancer","normal"))
mir_res
## log2 fold change (MAP): Hist cancer vs normal
## Wald test p-value: Hist cancer vs normal
## DataFrame with 3706 rows and 6 columns
## baseMean log2FoldChange lfcSE
## <numeric> <numeric> <numeric>
## hsa-let-7a-5p 4.01521e+04 0.788898 0.315919
## hsa-let-7a-3p 3.51211e+01 0.727370 0.488667
## hsa-let-7a-2-3p 3.72577e-01 -0.938634 1.299768
## hsa-let-7b-5p 4.77149e+03 1.072733 0.436635
## hsa-let-7c-5p 2.61070e+03 -0.577264 0.691565
## ... ... ... ...
## 9_TCTGATCAGGCAAAATTGCAGA_hsa-mir-4772 1.531377 1.3809634 1.14758
## 9_TGCAGGAACTTGTGAGTCTC_hsa-mir-873 1.348568 0.5957587 1.31484
## 9_TGCAGGAACTTGTGAGTCTCC_hsa-mir-873 2.164927 1.4813945 1.25393
## 9_TGCAGGAACTTGTGAGTCTCCT_hsa-mir-873 2.815976 0.8249337 1.25966
## 9_TGCCCCAACAAGGAAGGACA_hsa-mir-5699 0.973761 -0.0321778 1.08451
## stat pvalue padj
## <numeric> <numeric> <numeric>
## hsa-let-7a-5p 2.497153 0.0125195 0.0509186
## hsa-let-7a-3p 1.488479 0.1366247 0.2753381
## hsa-let-7a-2-3p -0.722155 0.4701992 NA
## hsa-let-7b-5p 2.456819 0.0140173 0.0553042
## hsa-let-7c-5p -0.834722 0.4038745 0.5770754
## ... ... ... ...
## 9_TCTGATCAGGCAAAATTGCAGA_hsa-mir-4772 1.2033737 0.228832 NA
## 9_TGCAGGAACTTGTGAGTCTC_hsa-mir-873 0.4531023 0.650475 NA
## 9_TGCAGGAACTTGTGAGTCTCC_hsa-mir-873 1.1814031 0.237443 NA
## 9_TGCAGGAACTTGTGAGTCTCCT_hsa-mir-873 0.6548868 0.512541 0.670913
## 9_TGCCCCAACAAGGAAGGACA_hsa-mir-5699 -0.0296704 0.976330 NA
Some summary statistic
summary(mir_res, alpha=0.05)
##
## out of 3706 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 290, 7.8%
## LFC < 0 (down) : 258, 7%
## outliers [1] : 43, 1.2%
## low counts [2] : 1422, 38%
## (mean count < 3)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
Filtering BH adjusted pvalue results
mir_DE <- filter(as.data.frame(mir_res), padj < 0.05)
summary(mir_DE)
## baseMean log2FoldChange lfcSE stat
## Min. : 2.66 Min. :-4.8585 Min. :0.2554 Min. :-6.1428
## 1st Qu.: 17.43 1st Qu.:-1.9653 1st Qu.:0.5223 1st Qu.:-3.2635
## Median : 60.70 Median : 1.4146 Median :0.6422 Median : 2.5797
## Mean : 962.57 Mean : 0.3375 Mean :0.6735 Mean : 0.3407
## 3rd Qu.: 300.83 3rd Qu.: 2.5776 3rd Qu.:0.7872 3rd Qu.: 3.4037
## Max. :95650.32 Max. : 5.3210 Max. :1.3206 Max. : 8.5263
## pvalue padj
## Min. :0.0000000 Min. :0.000000
## 1st Qu.:0.0001065 1st Qu.:0.001731
## Median :0.0008250 Median :0.006730
## Mean :0.0024880 Mean :0.012903
## 3rd Qu.:0.0037716 3rd Qu.:0.020552
## Max. :0.0121402 Max. :0.049646
hist(mir_DE$log2FoldChange, border="white", col="darkgreen")
hist(log(mir_DE$baseMean), border="white", col="darkblue")
HPV effect on cancer
In the next chunk explores the effect of HPV on cancer. To be more precise, it explores which genes are DE in HPV in cancer. Firstly compare HPV - and controls to get a list of genes that re DE in cancer samples. Secondly, compare HPV + and HPV - samples and exclude genes from the previous comparison.
# making another dds object with different design
mir_dds_HPV <- mir_dds
# New column in colData for design
colData(mir_dds_HPV)$HistHPV <-
as.factor(paste0(colData(mir_dds_HPV)$Hist,colData(mir_dds_HPV)$HPV))
# setting the HistHPV des
design(mir_dds_HPV) =~ HistHPV
# Running the DESeq function
mir_dds_HPV <- DESeq(mir_dds_HPV, betaPrior = T)
# Contrasting between negative HPV and controls
res.cN.nN <-
results(mir_dds_HPV, contrast=c("HistHPV","cancerN","normalN"), tidy = T)
# Contrasting between HPV negative and positive
res.cP.cN <-
results(mir_dds_HPV, contrast=c("HistHPV","cancerP","cancerN"), tidy = T)
# selecting significant results
res.cP.cN <- dplyr::filter(res.cP.cN, padj < 0.1)
res.cN.nN <- dplyr::filter(res.cN.nN, padj < 0.1)
# excluding the cancer infulence
mir_HPV <- res.cP.cN[!is.element(res.cP.cN$row, res.cN.nN$row),]
mir_HPV
## row baseMean log2FoldChange lfcSE
## 2 hsa-miR-106a-5p 19.477681 2.101696 0.5436871
## 3 hsa-miR-223-3p 185.415712 -1.197091 0.3587898
## 4 hsa-miR-9-5p 412.771663 3.044664 0.6038572
## 5 hsa-miR-9-3p 4.586593 2.336957 0.6819823
## 7 hsa-miR-20b-5p 32.034451 2.376377 0.6252002
## 8 hsa-miR-512-3p 1.512275 -2.561579 0.7777937
## 9 15_TCTTTGGTTATCTAGCTGTAT_hsa-mir-9-1 7.596875 4.434122 0.7640074
## 10 15_TCTTTGGTTATCTAGCTGTAT_hsa-mir-9-2 7.724201 3.370782 0.6892955
## 11 15_TCTTTGGTTATCTAGCTGTAT_hsa-mir-9-3 7.579966 2.942359 0.6781000
## 12 15_TCTTTGGTTATCTAGCTGTATG_hsa-mir-9-1 13.985244 2.849790 0.6413319
## 13 15_TCTTTGGTTATCTAGCTGTATG_hsa-mir-9-2 14.678740 3.270937 0.6932560
## 14 15_TCTTTGGTTATCTAGCTGTATG_hsa-mir-9-3 14.331517 3.216853 0.6763908
## 15 15_TTATGGTTTGCCTGGGACTGA_hsa-mir-584 33.754315 -1.927966 0.4920134
## 21 49_AATTGCACGGTATCCATCTGTAT_hsa-mir-363 37.467233 2.478019 0.5739185
## 22 5_CAAAGTGCTCATAGTGCAGGTA_hsa-mir-20b 3.705624 2.248063 0.6636685
## 23 5_CAAAGTGCTCATAGTGCAGGTAGA_hsa-mir-20b 4.186673 2.487489 0.7442990
## 24 5_CAAAGTGCTCATAGTGCAGGTAGT_hsa-mir-20b 11.510466 2.657850 0.7243385
## 25 50_ATTGCACGGTATCCATCTGT_hsa-mir-363 18.582416 1.925582 0.5581465
## 26 50_ATTGCACGGTATCCATCTGTAA_hsa-mir-363 10.831970 2.460293 0.5737673
## 27 50_ATTGCACGGTATCCATCTGTAT_hsa-mir-363 10.806923 2.391778 0.6827111
## stat pvalue padj
## 2 3.865635 1.108006e-04 2.476393e-02
## 3 -3.336469 8.485003e-04 9.155025e-02
## 4 5.042026 4.606279e-07 7.206524e-04
## 5 3.426712 6.109366e-04 8.007564e-02
## 7 3.800985 1.441221e-04 3.006386e-02
## 8 -3.293391 9.898680e-04 9.991281e-02
## 9 5.803769 6.484062e-09 2.028863e-05
## 10 4.890184 1.007420e-06 1.050739e-03
## 11 4.339123 1.430527e-05 4.932882e-03
## 12 4.443550 8.848675e-06 4.614584e-03
## 13 4.718224 2.379125e-06 1.488857e-03
## 14 4.755909 1.975550e-06 1.488857e-03
## 15 -3.918523 8.909336e-05 2.144409e-02
## 21 4.317718 1.576504e-05 4.932882e-03
## 22 3.387327 7.057715e-04 8.493688e-02
## 23 3.342056 8.316018e-04 9.155025e-02
## 24 3.669348 2.431697e-04 4.670797e-02
## 25 3.449959 5.606721e-04 8.007564e-02
## 26 4.287963 1.803194e-05 5.129266e-03
## 27 3.503354 4.594393e-04 7.566240e-02
Volcano plot
volcano <- mir_res
LFC.TRESH <- 1
PVAL.TRESH <- 0.05
# Building volcano plot
volcano <- as.data.frame(mir_res) %>%
dplyr::filter(!is.na(padj)) %>% #filtering NA's
dplyr::mutate(diffexpressed = dplyr::case_when(
padj < PVAL.TRESH & log2FoldChange > LFC.TRESH ~ "UP",
padj < PVAL.TRESH & log2FoldChange < -LFC.TRESH ~ "DOWN",
TRUE ~ "NO"))
# Volcano colors
volcano_cols <- c("red", "gray60", "green")
# p value tresh for delabeling
p_volcano_tresh <- arrange(volcano,padj)[12, "padj"]
volcano <- volcano %>%
mutate(delabel = ifelse(padj <= p_volcano_tresh,
rownames(volcano),
NA))
ggplot(data=as.data.frame(volcano),
aes(x=log2FoldChange,
y=-log10(padj),
col=diffexpressed))+
geom_point()+
theme_minimal()+
scale_color_manual(values = volcano_cols)
ggplot(data=as.data.frame(volcano),
aes(x=log2FoldChange,
y=-log10(padj),
col=diffexpressed))+
geom_point()+
theme_minimal()+
scale_color_manual(values = volcano_cols)+
geom_text_repel(aes(label = delabel), show.legend = FALSE,
size =3, max.overlaps = 50)
## Warning: Removed 2229 rows containing missing values (geom_text_repel).
Counts plot
Visualizing counts for genes
# Using the list of genes that are labeled in volcano plot
genes <- rownames(filter(volcano, !is.na(delabel)))
# Building an object for multiple counts plot
op <- par(mfrow = c(3,4),
oma = c(2,2,0,0) + 1,
mar = c(0,0,1,1) + 1.2)
# Plotting
for(gene in genes){
plotCounts(mir_dds, gene = gene, intgroup = c("Hist"),
returnData = F, pch = 19)
}
par(op)
Raw DNAm data was previously published in Gasperov paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7554960/ Raw data was preprocessed with Champ tool - normalization and batch effect removal. For simplicity preprocesed Rdata was loaded in this notebook. Data was provided from Sabol.
For this analysis only cancer samples that have available microRNA and mRNA data will be considered. Since there is no methylation data for control samples of which there are microRNA and mRNA data, other control samples will be used for all analysis.
Methylation data is saved in an object from the RnBeads class.
load("inputs/rnb.set_Combat.RData")
# Renaming the rnb set
rnb_set <- rnb.set_Combat
rm(rnb.set_Combat)
This methylation dataset contains data about lesions which was not considered in this research. Lesion samples are removed here. Furthermore, additional phenotype data is added here. The xlsx table was downloaded from cloud and is called additional_columns_impute.xlsx and renamed to DNAm_additional_pheno.xlsx
# Removing lesions that are not included in the analysis
rnb_set <-
remove.samples(rnb_set,
samplelist = (pheno(rnb_set)$Sample_group == "Lesion"))
dnam_pheno <-
read_xlsx("inputs/DNAm_additional_pheno.xlsx", range = cell_cols("A:M"))
# Filtering out Lesions and renaming HPV values to P and N
dnam_pheno <-
dnam_pheno %>%
filter(Sample_group != "Lesion") %>%
mutate(HPV = ifelse(HPV == 1 , "P", "N"))
# Adding HPV and other phenotype data
rnb_set <- addPheno(rnb_set, dnam_pheno$HPV, "HPV")
rnb_set <- addPheno(rnb_set, dnam_pheno$Gender, "Gender")
rnb_set <- addPheno(rnb_set, dnam_pheno$AgeGroup, "Age_group")
For DNAm data there are multiple regions that could be taken into consideration. Some of the standards are CpG sites that evaluate individual CpG’s, promoters (-2000, 500 from TSS), gene (gene body) and CpG islands (CGI are ranges with higher frequency of CpG’s that are previously annotated and are known).
PCA plot PCA plots could be generated over multiple regions. Individual CpG is computationaly expensive to calculate so only PCA on CGI is preformed
# Preforming dimension reduction on CpG islands
dred_islands <- rnb.execute.dreduction(rnb_set, target ="cpgislands")
# Building dnam_pca_data table for plotting
dnam_pca_data <- as.data.frame(dred_islands$pca$x[,c("PC1","PC2")])
# Additional annotation data
dnam_pca_data <-
cbind(dnam_pca_data,
dplyr::select(as.data.frame(pheno(rnb_set)),
Sample_group, HPV, Gender, Age_group))
ggplot(dnam_pca_data,aes(x=PC1,y=PC2, color=Sample_group, shape = HPV))+
geom_point(data = dnam_pca_data, size=4)+
ggtitle("PCA plot - CpG islands")+
theme(panel.grid = element_blank(),
panel.background = element_rect(fill = "white"),
panel.border = element_rect(fill = "transparent"))+
geom_text_repel(aes(label = rownames(dnam_pca_data)), size =3)
Sample distance
To asses overall similarity between samples we calculate sample distance using the R function dist to calculate the Euclidean distance.
Triangle sample distance heatmap
# dataframe with annotation rows for triangle heatmap
dnam_annot_row <- dplyr::select(as.data.frame(pheno(rnb_set)),
Sample_group, HPV, Gender, Age_group)
# Annotation dfs need to have samples as rownames
rownames(dnam_annot_row) <- pheno(rnb_set)$Sample_ID
# Call the local function
triangleHeatMap(meth(rnb_set),
color = "Greens",
annot_row = dnam_annot_row)
Differential methylation can be calculated for specific sites or genomic regions. Since promoter and gene (gene body) regions are arbitrary picked ranges, CpG islands regions are used for differential methylation analysis. To see meaning of each column look at RnBeads vignette. For most of the analysis mean.mean.diff was taken into consideration because the value is similar as fold change in sequencing data.
diff_meth <- rnb.execute.computeDiffMeth(rnb_set,
pheno.cols = "Sample_group",
region.types = "cpgislands")
## 2021-11-03 11:40:51 3.3 STATUS STARTED Retrieving comparison info
## 2021-11-03 11:40:51 3.3 STATUS COMPLETED Retrieving comparison info
##
## 2021-11-03 11:40:51 3.3 STATUS STARTED Computing differential methylation tables
## 2021-11-03 11:40:51 3.3 STATUS STARTED Comparing: Cancer vs. Normal (based on Sample_group)
## 2021-11-03 11:40:51 3.3 STATUS STARTED Computing Differential Methylation Table
## 2021-11-03 11:40:52 3.2 INFO Conducting differential analysis using limma
## 2021-11-03 11:40:56 3.9 STATUS COMPLETED Computing Differential Methylation Table
## 2021-11-03 11:40:56 4.0 STATUS STARTED Computing Differential Methylation Tables (Region Level)
## 2021-11-03 11:41:10 3.2 STATUS Computed table for cpgislands
## 2021-11-03 11:41:10 3.2 STATUS COMPLETED Computing Differential Methylation Tables (Region Level)
## 2021-11-03 11:41:10 3.2 STATUS COMPLETED Comparing: Cancer vs. Normal (based on Sample_group)
## 2021-11-03 11:41:10 3.2 STATUS COMPLETED Computing differential methylation tables
# Diff methylation table for CpG islands
dnam_res <- get.table(diff_meth,
get.comparisons(diff_meth)[1],
"cpgislands",
return.data.frame =T)
# Diff methylation table for CpG sites, used for volcano plot
dnam_res_sites <- get.table(diff_meth,
get.comparisons(diff_meth)[1],
"sites",
return.data.frame =T)
head(dnam_res)
## mean.mean.g1 mean.mean.g2 mean.mean.diff mean.mean.quot.log2 comb.p.val
## 1 0.023044587 0.023533981 -0.0004893938 -0.1369526 0.099288297
## 2 0.033497661 0.024413333 0.0090843283 0.3374428 0.069531360
## 3 0.295863132 0.269334721 0.0265284110 0.1750961 0.123067217
## 4 0.014025619 0.024321920 -0.0102963011 -0.7440855 0.057410806
## 5 0.007239818 0.008961348 -0.0017215307 -0.1373170 0.402870410
## 6 0.329051525 0.272994051 0.0560574747 0.3087724 0.004687685
## comb.p.adj.fdr num.sites mean.num.na.g1 mean.num.na.g2 combinedRank
## 1 0.13005057 2 0 0 23798
## 2 0.10156993 2 0 0 16924
## 3 0.15221326 3 0 0 19989
## 4 0.08993014 3 0 0 15783
## 5 0.42145158 1 0 0 23633
## 6 0.01697826 6 0 0 12393
Summary statistic of significant cpg island differential methylation.
summary(filter(dnam_res, comb.p.adj.fdr < 0.05))
## mean.mean.g1 mean.mean.g2 mean.mean.diff
## Min. :0.0000295 Min. :0.0004438 Min. :-0.769649
## 1st Qu.:0.0571093 1st Qu.:0.0357609 1st Qu.:-0.079607
## Median :0.1969202 Median :0.0912668 Median : 0.001330
## Mean :0.3064860 Mean :0.3116314 Mean :-0.005145
## 3rd Qu.:0.5428289 3rd Qu.:0.6654861 3rd Qu.: 0.072257
## Max. :0.9798466 Max. :0.9968258 Max. : 0.637812
## mean.mean.quot.log2 comb.p.val comb.p.adj.fdr num.sites
## Min. :-5.06726 Min. :0.000000 Min. :0.000000 Min. : 1.000
## 1st Qu.:-0.44404 1st Qu.:0.000153 1st Qu.:0.001384 1st Qu.: 3.000
## Median :-0.08196 Median :0.001966 Median :0.008894 Median : 4.000
## Mean : 0.14058 Mean :0.005250 Mean :0.015202 Mean : 5.189
## 3rd Qu.: 0.70880 3rd Qu.:0.008985 3rd Qu.:0.027093 3rd Qu.: 6.000
## Max. : 4.53599 Max. :0.022096 Max. :0.049970 Max. :74.000
## mean.num.na.g1 mean.num.na.g2 combinedRank
## Min. :0 Min. :0 Min. : 18
## 1st Qu.:0 1st Qu.:0 1st Qu.: 6123
## Median :0 Median :0 Median :10954
## Mean :0 Mean :0 Mean :11566
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:16850
## Max. :0 Max. :0 Max. :24722
Global DNA hypomethylation was observed in most cancer tissues and hypermethylation of certain regions. Since hypomethylation happens at the level of the genome it is hard to pinpoint if there are direct oncogenic effect or is it a secondary result of malignancy. However , DNA hypermethylation on CpG rich locations named CpG islands which may affect gene transcirption or chromatin structure. Here we explore influecne of differential methylation of cpg islands in promoter regions.
Making GRanges for cpg island data then finding overlaps within hg19 promoter regions. Promoter regions are defined as -2000 and +500 bases from the transcription start site (TSS).
# Combining dnam res data with granges data
dnam_res_cgi <-
data.frame(dplyr::select(as.data.frame(annotation(rnb_set, "cpgislands")),
Chromosome,Start,End,Strand),
dplyr::select(as.data.frame(dnam_res),
log.quot="mean.mean.quot.log2",
pvalue="comb.p.val",
padj="comb.p.adj.fdr",
num.sites))
# Making Granges object
dnam_res_cgi <-
makeGRangesFromDataFrame(dnam_res_cgi,keep.extra.columns=TRUE)
# Getting promoter ranges with gene names
promoter_ranges <- unlist(rnb.get.annotation(type="promoters", assembly="hg19"),
use.names = F) %>%
dplyr::select(symbol)
# Transfering ensembl genes to a column so that it does not lose in join overlaps
promoter_ranges$ensembl <- names(promoter_ranges)
# Overlaping cgi with promoter of genes
dnam_res_cgi <-
join_overlap_left(dnam_res_cgi, promoter_ranges)
Volcano plot
Dodati neki tekst tu
LFC.TRESH <- 0
PVAL.TRESH <- 0.05
volcano_cols <- c("green","red", "gray60")
# volcano data frame
volcano <- data.frame(values(dnam_res_cgi)) %>%
dplyr::mutate(diffexpressed = dplyr::case_when(
padj < PVAL.TRESH & log.quot > LFC.TRESH ~ "Hypermethylated",
padj < PVAL.TRESH & log.quot < -LFC.TRESH ~ "Hypomethylated",
TRUE ~ "Not significant"))
ggplot(data=as.data.frame(volcano),
aes(x=log.quot,
y=-log10(padj),
col=diffexpressed))+
geom_point()+
theme_minimal()+
scale_color_manual(values = volcano_cols)
table(volcano$diffexpressed)
##
## Hypermethylated Hypomethylated Not significant
## 5457 6557 16817
Since there are too many points on the volcano plot for differential methylation on cpg sites a histogram is used to visualize that most of cpg sites are hypomethylated. CpGs with padj < 0.05 are used for construction of histogram
# Volcano data frame
dnam_res_sites <-
dplyr::select(dnam_res_sites,
log.quot="mean.quot.log2",
pvalue="diffmeth.p.val",
padj="diffmeth.p.adj.fdr")
hist(dplyr::filter(dnam_res_sites, padj < 0.05)$log.quot,
main="Histogram of CpG sites with padj < 0.05",
xlab="log.quot",
col="darkgreen",
border="white")
GSEA analysis
Preforming gsea analysis on genes that have differentially methylated cpg islands in promoters. Number of genes are around 5700. KEGG pathways like Pathways in cancer, Breast cancer, Rap1 signaling pathway
# Using only part of sources
GSEA_SOURCES <- c("GO:BP","CORUM","KEGG","REAC","WP","MIRNA")
gost_DNAm <-
gost(filter(dnam_res_cgi, padj < 0.05 & !is.na(ensembl))$ensembl,
organism= "hsapiens",
exclude_iea=TRUE,
domain_scope="annotated",
ordered_query=F,
sources=GSEA_SOURCES)
# Change query name
gost_DNAm$result$query <- "CGI"
#Plot and table of significant results
gostplot(gost_DNAm ,capped = F, interactive = T)
sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=hr_HR.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=hr_HR.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=hr_HR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=hr_HR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] grid stats4 parallel stats graphics grDevices utils
## [8] datasets methods base
##
## other attached packages:
## [1] biomaRt_2.48.3
## [2] dplyr_1.0.7
## [3] plyranges_1.12.1
## [4] readxl_1.3.1
## [5] RnBeads.hg19_1.24.0
## [6] RnBeads_2.10.0
## [7] plyr_1.8.6
## [8] methylumi_2.38.0
## [9] minfi_1.38.0
## [10] bumphunter_1.34.0
## [11] locfit_1.5-9.4
## [12] iterators_1.0.13
## [13] foreach_1.5.1
## [14] Biostrings_2.60.2
## [15] XVector_0.32.0
## [16] FDb.InfiniumMethylation.hg19_2.2.0
## [17] org.Hs.eg.db_3.13.0
## [18] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
## [19] reshape2_1.4.4
## [20] scales_1.1.1
## [21] illuminaio_0.34.0
## [22] limma_3.48.3
## [23] gridExtra_2.3
## [24] gplots_3.1.1
## [25] fields_12.5
## [26] viridis_0.6.2
## [27] viridisLite_0.4.0
## [28] spam_2.7-0
## [29] dotCall64_1.0-1
## [30] ff_4.0.4
## [31] bit_4.0.4
## [32] cluster_2.1.2
## [33] MASS_7.3-54
## [34] gprofiler2_0.2.1
## [35] ggrepel_0.9.1
## [36] ggplot2_3.3.5
## [37] RColorBrewer_1.1-2
## [38] pheatmap_1.0.12
## [39] DESeq2_1.32.0
## [40] SummarizedExperiment_1.22.0
## [41] MatrixGenerics_1.4.3
## [42] matrixStats_0.61.0
## [43] tximport_1.20.0
## [44] stringr_1.4.0
## [45] GenomicFeatures_1.44.2
## [46] AnnotationDbi_1.54.1
## [47] Biobase_2.52.0
## [48] GenomicRanges_1.44.0
## [49] GenomeInfoDb_1.28.4
## [50] IRanges_2.26.0
## [51] S4Vectors_0.30.2
## [52] BiocGenerics_0.38.0
## [53] BiocManager_1.30.16
## [54] rmarkdown_2.11
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.2 tidyselect_1.1.1
## [3] RSQLite_2.2.8 htmlwidgets_1.5.4
## [5] BiocParallel_1.26.2 munsell_0.5.0
## [7] codetools_0.2-18 preprocessCore_1.54.0
## [9] withr_2.4.2 colorspace_2.0-2
## [11] filelock_1.0.2 highr_0.9
## [13] knitr_1.36 rstudioapi_0.13
## [15] labeling_0.4.2 GenomeInfoDbData_1.2.6
## [17] farver_2.1.0 bit64_4.0.5
## [19] rhdf5_2.36.0 vctrs_0.3.8
## [21] generics_0.1.1 xfun_0.27
## [23] BiocFileCache_2.0.0 R6_2.5.1
## [25] bitops_1.0-7 rhdf5filters_1.4.0
## [27] cachem_1.0.6 reshape_0.8.8
## [29] DelayedArray_0.18.0 assertthat_0.2.1
## [31] vroom_1.5.5 BiocIO_1.2.0
## [33] gtable_0.3.0 rlang_0.4.12
## [35] genefilter_1.74.1 splines_4.1.1
## [37] rtracklayer_1.52.1 lazyeval_0.2.2
## [39] GEOquery_2.60.0 yaml_2.2.1
## [41] crosstalk_1.1.1 tools_4.1.1
## [43] nor1mix_1.3-0 ellipsis_0.3.2
## [45] jquerylib_0.1.4 siggenes_1.66.0
## [47] Rcpp_1.0.7 sparseMatrixStats_1.4.2
## [49] progress_1.2.2 zlibbioc_1.38.0
## [51] purrr_0.3.4 RCurl_1.98-1.5
## [53] prettyunits_1.1.1 openssl_1.4.5
## [55] magrittr_2.0.1 data.table_1.14.2
## [57] hms_1.1.1 evaluate_0.14
## [59] xtable_1.8-4 XML_3.99-0.8
## [61] mclust_5.4.7 compiler_4.1.1
## [63] tibble_3.1.5 maps_3.4.0
## [65] KernSmooth_2.23-20 crayon_1.4.1
## [67] htmltools_0.5.2 tzdb_0.2.0
## [69] tidyr_1.1.4 geneplotter_1.70.0
## [71] DBI_1.1.1 dbplyr_2.1.1
## [73] rappdirs_0.3.3 Matrix_1.3-4
## [75] readr_2.0.2 cli_3.1.0
## [77] quadprog_1.5-8 pkgconfig_2.0.3
## [79] GenomicAlignments_1.28.0 plotly_4.10.0
## [81] xml2_1.3.2 annotate_1.70.0
## [83] bslib_0.3.1 rngtools_1.5.2
## [85] multtest_2.48.0 beanplot_1.2
## [87] doRNG_1.8.2 scrime_1.3.5
## [89] digest_0.6.28 base64_2.0
## [91] cellranger_1.1.0 DelayedMatrixStats_1.14.3
## [93] restfulr_0.0.13 curl_4.3.2
## [95] Rsamtools_2.8.0 gtools_3.9.2
## [97] rjson_0.2.20 lifecycle_1.0.1
## [99] nlme_3.1-153 jsonlite_1.7.2
## [101] Rhdf5lib_1.14.2 askpass_1.1
## [103] fansi_0.5.0 pillar_1.6.4
## [105] lattice_0.20-45 KEGGREST_1.32.0
## [107] fastmap_1.1.0 httr_1.4.2
## [109] survival_3.2-13 glue_1.4.2
## [111] png_0.1-7 stringi_1.7.5
## [113] sass_0.4.0 HDF5Array_1.20.0
## [115] blob_1.2.2 caTools_1.18.2
## [117] memoise_2.0.0